We observe that high CERS1 expression is associated with sensitivity to BRD-1748. We use log2-transformed TPM and log2-transformed counts for CERS1 expression and log viability measures, respectively.
We examine the relationship between total aneuploidy and viability of cell lines. With all three aneuploidy metrics, there seems to be no significant correlation between total aneuploidy and viability.
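A minimal sketch of how such a correlation check could be run. The source does not name the three aneuploidy metrics or the correlation method, so the column names `metric_a`, `metric_b`, `metric_c`, the use of Spearman correlation, and the data (simulated, with no true association) are all assumptions for illustration.

```r
# Synthetic stand-in data: three hypothetical aneuploidy metrics and a
# log viability generated independently of them.
set.seed(1)
n <- 200
toy <- data.frame(
  metric_a     = rpois(n, lambda = 12),
  metric_b     = rpois(n, lambda = 8),
  metric_c     = rpois(n, lambda = 20),
  logviability = rnorm(n, mean = -0.5, sd = 1)
)

# Spearman rank correlation of each metric with log viability.
# exact = FALSE avoids warnings caused by ties in the count data.
metrics <- c("metric_a", "metric_b", "metric_c")
results <- t(sapply(metrics, function(m) {
  ct <- cor.test(toy[[m]], toy$logviability, method = "spearman", exact = FALSE)
  c(rho = unname(ct$estimate), p.value = ct$p.value)
}))
print(results)
```

Each row of `results` holds the correlation coefficient and p-value for one metric, so the three metrics can be read side by side.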
Since the CERS1 gene is found on chromosome 19p, we suspect there may be a relationship between 19p aneuploidy and cell line viability.
We first confirm that across 19p aneuploidy classes, the relationship between high CERS1 expression and sensitivity is intact.
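One way to check this, sketched on simulated data, is to fit the CERS1 regression separately within each 19p class and compare the slopes. The class labels (`loss`, `normal`, `gain`), the column name `aneuploidy19p`, and the simulated effect size are assumptions and are not taken from the analysis itself.

```r
# Synthetic data in which higher CERS1 expression lowers log viability
# in every 19p aneuploidy class.
set.seed(2)
n <- 300
classes <- c("loss", "normal", "gain")
toy <- data.frame(
  aneuploidy19p = factor(sample(classes, n, replace = TRUE), levels = classes),
  CERS1         = rnorm(n, mean = 2, sd = 1)
)
toy$logviability <- -0.7 * toy$CERS1 + rnorm(n, sd = 0.5)

# Fit logviability ~ CERS1 within each class and extract the CERS1 slope.
slopes <- sapply(split(toy, toy$aneuploidy19p), function(d) {
  coef(lm(logviability ~ CERS1, data = d))[["CERS1"]]
})
print(round(slopes, 3))
```

A negative slope in every class would be consistent with the relationship between high CERS1 expression and sensitivity holding regardless of 19p status.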
We examine the association between 19p aneuploidy events and viability following treatment with BRD-1748. Surprisingly, a gain of 19p is associated with resistance. Viabilities of cell lines with loss of 19p are not significantly different from those of cell lines with normal 19p.
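The source does not state which test underlies these comparisons. As an illustration only, the sketch below uses a Kruskal-Wallis test followed by Holm-adjusted pairwise Wilcoxon tests on simulated data in which only the gain class is shifted toward higher viability. The column names and class labels are assumptions.

```r
# Synthetic data: the "gain" class has higher log viability (resistance),
# while "loss" and "normal" share the same distribution.
set.seed(3)
classes <- c("loss", "normal", "gain")
toy <- data.frame(
  aneuploidy19p = factor(rep(classes, times = c(60, 120, 60)), levels = classes)
)
shift <- c(loss = 0, normal = 0, gain = 1)
toy$logviability <- rnorm(nrow(toy), mean = -0.5, sd = 0.6) +
  unname(shift[as.character(toy$aneuploidy19p)])

# Global test across the three classes.
kw <- kruskal.test(logviability ~ aneuploidy19p, data = toy)

# Pairwise comparisons with Holm correction for multiple testing.
pw <- pairwise.wilcox.test(toy$logviability, toy$aneuploidy19p,
                           p.adjust.method = "holm")
print(kw$p.value)
print(pw$p.value)
```

In the matrix `pw$p.value`, the entry in row `gain` and column `normal` is the adjusted p-value for gain versus normal, and the entry in row `normal` and column `loss` is the one for loss versus normal.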
We also compare the differences in CERS1 expression across 19p aneuploidy events. It seems that cell lines with 19p loss have significantly lower CERS1 expression. However, a gain of 19p does not result in significantly higher expression.
This leads us to hypothesize that the 19p aneuploidy annotation carries information that is not captured by CERS1 expression alone.
We compare a linear model regressing log viability on CERS1 expression versus a linear model regressing log viability on CERS1 expression and 19p aneuploidy annotation.
Model 1:
##
## Call:
## lm(formula = logviability ~ CERS1, data = viab.aneu.19p.CERS1exp.noNA.formodel)
##
## Coefficients:
## (Intercept)        CERS1
##     -0.5305      -0.7503
Model 2:
##
## Call:
## lm(formula = logviability ~ CERS1 + aneuploidy19p.num, data = viab.aneu.19p.CERS1exp.noNA.formodel)
##
## Coefficients:
##       (Intercept)              CERS1  aneuploidy19p.num
##           -0.5305            -0.7673             0.1270
We find that including 19p aneuploidy annotations in the model yields a significant improvement over using just CERS1 expression (p-value = 0.0104).
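A minimal sketch of a nested-model comparison of this kind, assuming it is done with an F-test via `anova()`. The source does not say which test produced the reported p-value. The data are simulated, and coding `aneuploidy19p.num` as -1, 0, 1 for loss, normal, gain is also an assumption.

```r
# Synthetic data with a CERS1 effect and a smaller 19p aneuploidy effect.
set.seed(4)
n <- 400
toy <- data.frame(
  CERS1             = rnorm(n, mean = 2, sd = 1),
  aneuploidy19p.num = sample(c(-1, 0, 1), n, replace = TRUE)  # assumed coding
)
toy$logviability <- -0.5 - 0.75 * toy$CERS1 +
  0.13 * toy$aneuploidy19p.num + rnorm(n, sd = 0.8)

# Model 1: CERS1 expression only. Model 2: CERS1 plus 19p aneuploidy.
m1 <- lm(logviability ~ CERS1, data = toy)
m2 <- lm(logviability ~ CERS1 + aneuploidy19p.num, data = toy)

# F-test for whether the extra term reduces the residual sum of squares
# by more than would be expected by chance.
cmp <- anova(m1, m2)
print(cmp)
p_value <- cmp[["Pr(>F)"]][2]
```

Because the two models differ by a single numeric term, this F-test p-value equals the p-value of the t-test on the `aneuploidy19p.num` coefficient in `summary(m2)`.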
Since the association among 19p aneuploidy, CERS1 expression, and sensitivity is not as simple as 19p aneuploidy affecting CERS1 expression, which in turn determines sensitivity, we hypothesize that the other CERS genes may compensate when there is a 19p aneuploidy event.
We find the locations of the other CERS or CERS-related genes:
The expression pattern of CERS3 is interesting: no cell lines have high expression of both CERS3 and CERS1. In addition, no cell lines with high CERS3 expression have a gain of 19p.
We compare a linear model regressing log viability on CERS1 expression and 19p aneuploidy versus a linear model regressing log viability on CERS1 expression, 19p aneuploidy annotation, and CERS3 expression.
Model 1:
##
## Call:
## lm(formula = logviability ~ CERS1 + aneuploidy19p.num, data = otherCERSexp.viab.aneu.19p.noNA)
##
## Coefficients:
##       (Intercept)              CERS1  aneuploidy19p.num
##            0.4567            -0.5563             0.2330
Model 2:
##
## Call:
## lm(formula = logviability ~ CERS1 + aneuploidy19p.num + CERS3,
## data = otherCERSexp.viab.aneu.19p.noNA)
##
## Coefficients:
##       (Intercept)              CERS1  aneuploidy19p.num              CERS3
##            0.5112            -0.5699             0.2128            -0.1488
The model that includes CERS3 expression performs only marginally better than the one that includes only CERS1 expression and 19p aneuploidy.
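The source does not say how "better" is measured here. One common way to quantify a small gain from an added predictor is to compare adjusted R^2 and AIC, sketched below on simulated data. The effect sizes and the -1, 0, 1 coding of `aneuploidy19p.num` are assumptions.

```r
# Synthetic data with a modest CERS3 effect on top of CERS1 and 19p aneuploidy.
set.seed(5)
n <- 400
toy <- data.frame(
  CERS1             = rnorm(n, mean = 2, sd = 1),
  CERS3             = rnorm(n, mean = 1, sd = 1),
  aneuploidy19p.num = sample(c(-1, 0, 1), n, replace = TRUE)  # assumed coding
)
toy$logviability <- 0.5 - 0.55 * toy$CERS1 + 0.2 * toy$aneuploidy19p.num -
  0.15 * toy$CERS3 + rnorm(n, sd = 0.8)

m1 <- lm(logviability ~ CERS1 + aneuploidy19p.num, data = toy)
m2 <- lm(logviability ~ CERS1 + aneuploidy19p.num + CERS3, data = toy)

# Adjusted R^2 penalizes the extra parameter; lower AIC indicates a better
# trade-off between fit and complexity.
fit_summary <- data.frame(
  model  = c("CERS1 + 19p", "CERS1 + 19p + CERS3"),
  adj_r2 = c(summary(m1)$adj.r.squared, summary(m2)$adj.r.squared),
  AIC    = c(AIC(m1), AIC(m2))
)
print(fit_summary)
```

Plain R^2 can never decrease when a predictor is added to a nested model, which is why penalized measures are the more informative comparison.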
We include the expression of serine synthesis pathway genes, in addition to the expression of CERS and CERS-related genes and chromosomes 15 and 19p aneuploidy annotations, and perform linear regression on the log viability. The coefficients assigned by the model are shown below.
To improve interpretability and to observe interactions between the features, we derive a prediction rule ensemble using the “pre” R package. This model achieves an R^2 value of 0.3795. The rules in the ensemble, represented as decision trees, are as follows: